% please textwrap! It helps svn not have conflicts across a multitude of lines.
%
% vim:set textwidth=78:

\begin{abstract}

In this assignment we use RapidMiner 5.1 to explore data preprocessing
techniques and to compare the performance of a linear support vector machine
(SVM) with that of a Naive Bayes learner at accurately predicting a binary
target whose positive class is rare.
We demonstrate our techniques by modeling the response to the 97NK mailing
from the KDD Cup 1998 training data, using the Fast Large Margin operator for
the linear SVM and the Naive Bayes learner available in RapidMiner 5.1.
We use the area under the ROC curve (AUC) as the metric for comparing
performance.
We preprocess and use all $95{,}412$ examples, and we select features for
both models.
After applying our techniques, we obtain models with AUCs of \optauc{} and
\optaucb{} for the linear SVM and Naive Bayes learners, respectively.
Because memory is limited, we use only $10$ of the $480$ features in our
final models. However, we observe that using more features improves the AUC
of Naive Bayes, and we hypothesize that the same holds for the linear SVM.

\end{abstract}

